pacman::p_load(tidyverse,
               ggplot2,
               sf,
               sp,
               tmap,
               leaflet)

Why should you use maps?

Maps are used in a variety of fields to express data in an appealing and interpretive way. Data can be expressed into simplified patterns, and this data interpretation is generally lost if the data is only seen through a spread sheet. Maps can add vital context by incorporating many variables into an easy to read and applicable context. Maps are also very important in the information world because they can quickly allow the public to gain better insight so that they can stay informed. It’s critical to have maps be effective, which means creating maps that can be easily understood by a given audience.

Basic elements of a map that should be considered are polygon, points, lines, and text.

  • Polygons, on a map, are closed shapes such as country borders.
  • Lines are considered to be linear shapes that are not filled with any aspect, such as highways, streams, or roads.
  • Finally, points are used to specify specific positions, such as city or landmark locations.

Using R to create maps brings these benefits to mapping. Elements of a map can be added or removed with ease — R code can be tweaked to make major enhancements with a stroke of a key. It is also easy to reproduce the same maps for different data sets.

The package ggplot2 implements the grammar of graphics in R, as a way to create code that make sense to the user: The grammar of graphics is a term used to breaks up graphs into semantic components, such as geometries and layers. Practically speaking, it allows (and forces!) the user to focus on graph elements at a higher level of abstraction, and how the data must be structured to achieve the expected outcome. Recently, the package ggplot2 has allowed the use of simple features from the package sf as layers in a graph1. The combination of ggplot2 and sf therefore enables to programmatically create maps, using the grammar of graphics.

Introduction to tmap

With the tmap package, thematic maps can be generated with great flexibility. The syntax for creating plots is similar to that of ggplot2, but tailored to maps.

First steps

A good place to start is to create a map of the world.

The object World is a spatial object of class sf from the sf package; it is a data.frame with a special column that contains a geometry for each row, in this case polygons. In order to plot it in tmap, you first need to specify it with tm_shape. Layers can be added with the + operator, in this case tm_polygons. There are many layer functions in tmap, which can easily be found in the documentation by their tm_ prefix.

data("World")
class(World)
## [1] "sf"         "data.frame"

The “World”-DataFrame includes informations on the population, the economy and so on of all countries in the world.

World %>% as.data.frame() %>% head()

After installing tmap, the following lines of code should create the map shown below:

tmap_mode("plot")
tm_shape(World) +
    tm_polygons("HPI")

Multiple shapes and layers

A shape is a spatial object (with a class from sf, sp, stars, or raster). Multiple shapes and also multiple layers per shape can be plotted.

data(World, metro)

Let’s have a look into the metro-Data Frame. It shows the population of the metropolitan regions in the world, for example all metropolitan regions of Switzerland, i.e. Zurich.

metro %>% as.data.frame() %>% dplyr::filter(iso_a3=="CHE") %>% head()

Let’s look at the whole world population in 1950 and 2010.

tmap_mode("plot")
m1<-tm_shape(metro)+tm_bubbles("pop1950")
m2<-tm_shape(metro)+tm_bubbles("pop2010")
tmap_arrange(m1,m2,nrow=1)

tmap_mode("plot")
## tmap mode set to plotting
#tm_shape(land) +
    #tm_raster("elevation", palette = terrain.colors(4)) +
tm_shape(World) +
    tm_borders("gray", lwd = .5) +
    tm_text("iso_a3", size = "AREA") +
tm_shape(metro) +
    tm_symbols(col = "red", size = "pop2020", scale = .5) +
tm_legend(show = T)

A much better picture can be obtained if the information on the population is presented in a somewhat better graphic form and supplemented by a growth rate of the population figures. So, let’s add some color and a nicer looking legend.

# Add the column growth into metro dataset
metro$growth <- (metro$pop2020 - metro$pop2010) / (metro$pop2010 * 10) * 100
tm_shape(metro) +
tm_bubbles("pop2010", col = "growth", 
 breaks=c(-Inf, seq(0, 6, by=2), Inf),
               palette="-RdYlBu", contrast=1, 
               title.size="Metro population", 
               title.col="Growth rate (%)", id="name")+
  tm_layout(legend.outside = T)

Now we can simply put the two maps (the happiness index world map an the population growth bubbles) on top of each other and get a simple yet informative map.

data(World, metro)
metro$growth <- (metro$pop2020 - metro$pop2010) / (metro$pop2010 * 10) * 100

## Define global style and format for map
tmap_style("gray")
tmap_format("World_wide")
## $inner.margins
## [1] 0.000 0.200 0.025 0.010
## 
## $legend.position
## [1] "left"   "bottom"
## 
## $attr.position
## [1] "right"  "bottom"
## 
## $scale
## [1] 0.8
mapWorld <- tm_shape(World) +
    tm_polygons("income_grp", palette="-Blues", contrast=.7, id="name", title="Income group") +
    tm_shape(metro) +
    tm_bubbles("pop2010", col = "growth", 
               border.col = "black", border.alpha = .5, 
               style="fixed", breaks=c(-Inf, seq(0, 6, by=2), Inf),
               palette="-RdYlBu", contrast=1, 
               title.size="Metro population", 
               title.col="Growth rate (%)", id="name")

mapWorld

So this is our first static map. You can’t see details, but you can already see a lot of things from an overview point of view. But if you want to look at details of individual regions or metropolises, you need something more, namely an interactive version, which allows you to zoom into your point of interest.

This is very easy with the tmap package. The only thing to do is to change from the plot mode to the view mode. Afterwards we easyly plot the existing map a second time. See more of this in the Interactive maps section.

# set mode to view
# Play with the layers
tmap_mode("view")
mapWorld

Quick static maps with plot()

Static maps are the most common type of visual output from geocomputation. Standard formats include .png and .pdf for raster and vector outputs respectively.

The generic plot() function is often the fastest way to create static maps from vector and raster spatial objects. Sometimes, simplicity and speed are priorities, especially during the development phase of a project, and this is where plot() excels. The base R approach is also extensible, with plot() offering dozens of arguments.

# Plotting columns 3 to 6 from the World data frame.
plot(World[3:6])

Plots are added as layers to existing images by setting add = TRUE. The subsequent code chunk combines countries in Asia. The function st_union() combine or union feature geometries

world_asia = World[World$continent == "Asia", ] #Filtering all the rows from Asia continent
asia = st_union(world_asia) 
#Combining features from world_asia into a single geometry with resolved boundaries

We can now plot the Asian continent over a map of the world. Note that the first plot must only have one facet for add = TRUE to work. If the first plot has a key, reset = FALSE must be used:

# We need to run both lines to work.
plot(World["pop_est"], reset = FALSE)
plot(asia, add = TRUE, col = "red")

Adding layers in this way can be used to verify the geographic correspondence between layers: the plot() function is fast to execute and requires few lines of code, but does not create interactive maps with a wide range of options.

There are various ways to modify maps with sf’s plot() method. Because sf extends base R plotting methods plot()’s arguments such as main = (which specifies the title of the map) work with sf objects.

This figure illustrates this flexibility by overlaying circles, whose diameters (set with cex =) represent country populations, on a map of the world

plot(World["continent"], reset = FALSE)
cex = sqrt(World$pop_est) / 10000
world_cents = st_centroid(World, of_largest = TRUE)
plot(st_geometry(world_cents), add = TRUE, cex = cex)

tmap in depth

Adding layers to maps

# Back to default tmap style.
tmap_style()

Like ggplot2, tmap is based on the idea of a ‘grammar of graphics’ See ggplot section in R Basics. This involves a separation between the input data and the aesthetics (how data are visualized): each input dataset can be ‘mapped’ in a range of different ways including location on the map (defined by data’s geometry), color, and other visual variables.

The basic building block is tm_shape() (which defines input data, raster and vector objects), followed by one or more layer elements such as tm_fill() and tm_dots()

tmap_mode("plot")
library(spData)
library(spDataLarge)
library(raster)
# Add fill layer to nz shape
m1<-tm_shape(nz) +
  tm_fill() 
# Add border layer to nz shape
m2<-tm_shape(nz) +
  tm_borders() 
# Add fill and border layers to nz shape
m3<-tm_shape(nz) +
  tm_fill() +
  tm_borders() 
tmap_arrange(m1,m2,m3,nrow=1)

The object passed to tm_shape() in this case is nz, an sf object representing the regions of New Zealand. Layers are added to represent nz visually, with tm_fill() and tm_borders() creating shaded areas (left panel) and border outlines (middle panel).

This is an intuitive approach to map making: the common task of adding new layers is undertaken by the addition operator +, followed by tm_*(). The asterisk (*) refers to a wide range of layer types which have self-explanatory names including fill, borders (demonstrated above), bubbles, text and raster.

Pro tip: qtm() is a handy function to create quick thematic maps (hence the snappy name). It is concise and provides a good default visualization in many cases: qtm(nz), for example, is equivalent to tm_shape(nz) + tm_fill() + tm_borders(). The disadvantage is that it makes aesthetics of individual layers harder to control. Also tm_polygons() condenses tm_fill() + tm_borders()

m4<-qtm(nz)
m5<-tm_shape(nz) + tm_polygons()
tmap_arrange(m4,m5,nrow=1)

Notice that when storing the maps into objects in R we create a tmap object, which can be used later or add layers on top.

data(nz)
map_nz = tm_shape(nz) + tm_polygons()
class(map_nz)
## [1] "tmap"

New shapes can be added with + tm_shape(new_obj). In this case new_obj represents a new spatial object to be plotted on top of preceding layers. When a new shape is added in this way, all subsequent aesthetic functions refer to it, until another new shape is added. This syntax allows the creation of maps with multiple shapes and layers.

Example: the function tm_raster() to plot a raster layer (with alpha set to make the layer semi-transparent)

map_nz1 = map_nz +
  tm_shape(nz_elev) + tm_raster(alpha = 0.7)
map_nz1

Building on the previously created map_nz object, the preceding code creates a new map object map_nz1 that contains another shape (nz_elev) representing average elevation across New Zealand.

Aesthetics

The plots in the previous section demonstrate tmap’s default aesthetic settings:
- Gray shades are used for tm_fill() and tm_bubbles() layers - Continuous black line is used to represent lines created with tm_lines().

There are two main types of map aesthetics: those that change with the data and those that are constant. Unlike ggplot2, which uses the helper function aes() to represent variable aesthetics, tmap accepts aesthetic arguments that are either variable fields (based on column names) or constant values.

The most commonly used aesthetics for fill and border layers include color, transparency, line width and line type, set with col, alpha, lwd, and lty arguments, respectively.

m1 = tm_shape(nz) + tm_fill(col = "red")
m2 = tm_shape(nz) + tm_fill(col = "red", alpha = 0.3)
m3 = tm_shape(nz) + tm_borders(col = "blue")
m4 = tm_shape(nz) + tm_borders(lwd = 3)
m5 = tm_shape(nz) + tm_borders(lty = 2)
m6 = tm_shape(nz) + tm_fill(col = "red", alpha = 0.3) +
  tm_borders(col = "blue", lwd = 3, lty = 2)
tmap_arrange(m1, m2, m3, m4, m5, m6)

Notice that like base R plots, arguments defining aesthetics can also receive variables. But unlike R base tmap aesthetic arguments will not accept a numeric vector:

# plot(st_geometry(nz), col = nz$Land_area)  # works
# tm_shape(nz) + tm_fill(col = nz$Land_area) # fails
# Error: Fill argument neither colors nor valid variable name(s)

Instead col (and other aesthetics that can vary such as lwd for line layers and size for point layers) requires a character string naming an attribute associated with the geometry to be plotted.

tm_shape(nz) + tm_fill(col = "Land_area")

An important argument in functions defining aesthetic layers such as tm_fill() is title, which sets the title of the associated legend. The following code chunk demonstrates this functionality by providing a more attractive name than the variable name Land_area (note the use of expression() to create superscript text):

legend_title = expression("Area (km"^2*")")
map_nza = tm_shape(nz) +
  tm_fill(col = "Land_area", title = legend_title) + tm_borders()
map_nza

Color settings

m1<-tm_shape(nz) + tm_polygons(col = "Median_income") # default
breaks = c(0, 3, 4, 5) * 10000
m2<-tm_shape(nz) + tm_polygons(col = "Median_income", breaks = breaks) # breaks
m3<-tm_shape(nz) + tm_polygons(col = "Median_income", n = 10) # n
m4<-tm_shape(nz) + tm_polygons(col = "Median_income", palette = "BuGn") # pallette
tmap_arrange(m1, m2, m3, m4)

Another way to change color settings is by altering color break (or bin) settings. In addition to manually setting breaks tmap allows users to specify algorithms to automatically create breaks with the style argument. Here are six of the most useful break styles:

  • style = “pretty”, the default setting, rounds breaks into whole numbers where possible and spaces them evenly;

  • style = “equal” divides input values into bins of equal range and is appropriate for variables with a uniform distribution (not recommended for variables with a skewed distribution as the resulting map may end-up having little color diversity);

  • style = “quantile” ensures the same number of observations fall into each category (with the potential downside that bin ranges can vary widely);

  • style = “jenks” identifies groups of similar values in the data and maximizes the differences between categories;

  • style = “cont” (and “order”) present a large number of colors over continuous color fields and are particularly suited for continuous rasters (“order” can help visualize skewed distributions);

  • style = “cat” was designed to represent categorical values and assures that each category receives a unique color.

m1<-tm_shape(nz) + tm_polygons(col = "Median_income",style="pretty") 
m2<-tm_shape(nz) + tm_polygons(col = "Median_income",style="equal") 
m3<-tm_shape(nz) + tm_polygons(col = "Median_income",style="quantile") 
m4<-tm_shape(nz) + tm_polygons(col = "Median_income",style="jenks") 
m5<-tm_shape(nz_elev) + tm_raster(col = "elevation",style="cont") 
m6<-tm_shape(nz) + tm_polygons(col = "Island",style="cat") 
tmap_arrange(m1, m2, m3, m4,m5,m6,nrow=2)

Notes on palletes

Palettes define the color ranges associated with the bins and determined by the breaks, n, and style arguments described above. The default color palette is specified in tm_layout() however, it could be quickly changed using the palette argument. It expects a vector of colors or a new color palette name, which can be selected interactively with tmaptools::palette_explorer(). You can add a - as prefix to reverse the palette order.

There are three main groups of color palettes:

  1. Categorical palettes consist of easily distinguishable colors and are most appropriate for categorical data without any particular order such as state names or land cover classes.
  2. Sequential palettes.These follow a gradient, for example from light to dark colors (light colors tend to represent lower values), and are appropriate for continuous (numeric) variables.
  3. Diverging palettes. Typically range between three distinct colors and are usually created by joining two single-color sequential palettes with the darker colors at each end.

Color palettes should also be easy to understand to effectively convey information. It should be clear which values are lower and which are higher, and colors should change gradually.

Layouts

The map layout refers to the combination of all map elements into a cohesive map. Map elements include among others the objects to be mapped, the title, the scale bar, margins and aspect ratios, while the color settings covered in the previous section relate to the palette and break-points used to affect how the map looks.

Additional elements such as north arrows and scale bars have their own functions: tm_compass() and tm_scale_bar().

map_nz + 
  tm_compass(type = "8star", position = c("left", "top")) +
  tm_scale_bar(breaks = c(0, 100, 200), text.size = 1)

m1<- map_nz + tm_layout(title = "New Zealand")
m2<- map_nz + tm_layout(scale = 5)
m3<- map_nz + tm_layout(bg.color = "lightblue")
m4<- map_nz + tm_layout(frame = FALSE)
tmap_arrange(m1, m2, m3, m4)

Here are some useful layout settings

  • Frame width (frame.lwd) and an option to allow double lines (frame.double.line)

  • Margin settings including outer.margin and inner.margin

  • Font settings controlled by fontface and fontfamily

  • Legend settings including binary options such as legend.show (whether or not to show the legend) legend.only (omit the map) and legend.outside (should the legend go outside the map?), as well as multiple choice settings such as legend.position

  • Default colors of aesthetic layers (aes.color), map attributes such as the frame (attr.color)

  • Color settings controlling sepia.intensity (how yellowy the map looks) and saturation (a color-grayscale)

m1 <- tm_shape(nz) + tm_polygons(col = "Median_income")
m2 <- m1 + tm_layout(inner.margins = 0.2)
m2 <- m1 + tm_layout(legend.show = F)
m3 <- m1 + tm_layout(legend.position = c("right","bottom"))
m4 <- m1 + tm_layout(frame.lwd = 5)
m5 <- m1 + tm_layout(sepia.intensity = 2)
tmap_arrange(m2, m3, m4, m5)

Beyond the low-level control over layouts and colors, tmap also offers high-level styles, using the tm_style() function (representing the second meaning of ‘style’ in the package). Some styles such as tm_style(“cobalt”) result in stylized maps, while others such as tm_style(“gray”) make more subtle changes.

m1 <-  map_nza + tm_style("bw")
m2 <- map_nza + tm_style("classic")
m3 <- map_nza + tm_style("cobalt")
m4 <- map_nza + tm_style("col_blind")
tmap_arrange(m1, m2, m3, m4)

Pro tip: A preview of predefined styles can be generated by executing tmap_style_catalogue(). This creates a folder called tmap_style_previews containing nine images. Takes some time to run.

Faceted maps

Faceted maps, also referred to as ‘small multiples’, are composed of many maps arranged side-by-side, and sometimes stacked vertically. Facets enable the visualization of how spatial relationships change with respect to another variable, such as time.

Typically all individual facets in a faceted map contain the same geometry data repeated multiple times, once for each column in the attribute data. However, facets can also represent shifting geometries such as the evolution of a point pattern over time, for example:

The dataset urban_agglomerations includes only the top 30 largest areas by population at 5 year intervals. We use the %n% operator to identify if an element belongs to a vector.

head(urban_agglomerations)
urb_1970_2030 = urban_agglomerations %>% 
  filter(year %in% c(1970, 1990, 2010, 2030))
head(urb_1970_2030)
tm_shape(world) +
  tm_polygons()

tm_shape(world) +
  tm_polygons() +
  tm_shape(urb_1970_2030) +
  tm_symbols(col = "black", border.col = "white", size = "population_millions")

tm_shape(world) +
  tm_polygons() +
  tm_shape(urb_1970_2030) +
  tm_symbols(col = "black", border.col = "white", size = "population_millions") +
  tm_facets(by = "year", nrow = 2, free.coords = F)

The preceding code chunk demonstrates key features of faceted maps created with tmap:

  • Shapes that do not have a facet variable are repeated (the countries in world in this case)

  • The by argument which varies depending on a variable (year in this case).

  • The nrow/ncol setting specifying the number of rows and columns that facets should be arranged into

  • The free.coords parameter specifying if each map has its own bounding box

  • In addition to their utility for showing changing spatial relationships, faceted maps are also useful as the foundation for animated maps

Animated maps

Facets become tiny when there are many of them. Furthermore, the fact that each facet is physically separated on the screen or page means that subtle differences between facets can be hard to detect. Animated maps solve these issues. Although they depend on digital publication, this is becoming less of an issue as more and more content moves online. Animated maps can still enhance paper reports: you can always link readers to a web-page containing an animated (or interactive) version of a printed map to help make it come alive.

There are several ways to generate animations in R, including with animation packages such as gganimate, which builds on ggplot2. This section focusses on creating animated maps with tmap because its syntax will be familiar from previous sections and the flexibility of the approach.

urb_anim = tm_shape(world) + tm_polygons() + 
  tm_shape(urban_agglomerations) + tm_dots(size = "population_millions") +
  tm_facets(along = "year", free.coords = FALSE)

There are two differences, however, related to arguments in tm_facets():

  • along = “year” is used instead of by = “year”.
  • free.coords = FALSE, which maintains the map extent for each map iteration.

The resulting urb_anim represents a set of separate maps for each year. The final stage is to combine them and save the result as a .gif file with tmap_animation().

tmap_animation(urb_anim, filename = "urb_anim.gif", delay = 20,dpi=50)

Interactive maps

While static and animated maps can enliven geographic datasets, interactive maps can take them to a new level. Interactivity can take many forms, the most common and useful of which is the ability to pan around and zoom into any part of a geographic dataset overlaid on a ‘web map’ to show context.

Less advanced interactivity levels include popups which appear when you click on different features, a kind of interactive label. More advanced levels of interactivity include the ability to tilt and rotate maps, as demonstrated in the mapdeck example below, and the provision of “dynamically linked” sub-plots which automatically update when the user pans and zooms.

The most important type of interactivity, however, is the display of geographic data on interactive or ‘slippy’ web maps. The release of the leaflet package in 2015 revolutionized interactive web map creation from within R and a number of packages have built on these foundations adding new features (e.g., leaflet.extras) and making the creation of web maps as simple as creating static maps (e.g., tmap).

A unique feature of tmap is its ability to create static and interactive maps using the same code. Maps can be viewed interactively at any point by switching to view mode, using the command tmap_mode(“view”)

tmap_mode("view")
map_nz

Notable features of this interactive mode include the ability to specify the basemap with tm_basemap() (or tmap_options()) as demonstrated below (result not shown):

tm_shape(nz) +
  tm_borders() + tm_basemap(server = "OpenTopoMap")

An impressive and little-known feature of tmap’s view mode is that it also works with faceted plots The argument sync in tm_facets() can be used in this case to produce multiple maps with synchronized zoom and pan settings.

The coffee_data dataset contains data on the world coffee production data.

head(coffee_data,3)

Notice the country name column name_long is the same as in the world dataset:

head(world,3)

So we can use the Polygom maps of world to map the evolution of the world coffee production.

# We left join both datasets keeping all observations in left dataset
world_coffee = left_join(world, coffee_data, by = "name_long")
# Create two facets for 2016 and 2017
facets = c("coffee_production_2016", "coffee_production_2017")
tm_shape(world_coffee) + tm_polygons(facets) + 
  tm_facets(nrow = 1, sync = TRUE)

Switch tmap back to plotting mode with the same function: tmap_mode(“plot”)

tmap_mode("plot")
# tmap mode set to plotting

Quick interactive maps mapview()

If you are not proficient with tmap, the quickest way to create interactive maps may be with mapview. The following ‘one liner’ is a reliable way to interactively explore a wide range of geographic data formats:

library(mapview)
mapview::mapview(nz)

mapview has a concise syntax yet is powerful. By default, it provides some standard GIS functionality such as mouse position information, attribute queries (via pop-ups), scale bar, and zoom-to-layer buttons.

It offers advanced controls including the ability to ‘burst’ datasets into multiple layers and the addition of multiple layers with + followed by the name of a geographic object. Additionally, it provides automatic coloring of attributes (via argument zcol).

Given that mapview always expects a spatial object (sf, Spatial, Raster) as its first argument, it works well at the end of piped expressions. Consider the following example where sf is used to intersect lines and polygons and then is visualized with mapview.

nz %>% mapview(color = "red", lwd = 3, layer.name = "Land Area") +
  mapview(nz, zcol = "Land_area")

One important thing to keep in mind is that mapview layers are added via the + operator (similar to ggplot2 or tmap).For further information on mapview, see the package’s website

Leaflet

Last but not least is leaflet which is the most mature and widely used interactive mapping package in R. leaflet provides a relatively low-level interface to the Leaflet JavaScript library and many of its arguments can be understood by reading the documentation of the original JavaScript library

Leaflet maps are created with leaflet(), the result of which is a leaflet map object which can be piped to other leaflet functions.

This allows multiple map layers and control settings to be added interactively, as demonstrated in the code below:

  • The dataset cycle_hire is spatial Data (sf dataframe) that contains the Cycle hire points in London.
head(cycle_hire)
  • id Id of the hire point

  • name Name of the point

  • area Area they are in

  • nbikes The number of bikes currently parked there

  • nempty The number of empty places

  • geometry sfc_POINT

pal = colorNumeric("RdYlBu", domain = cycle_hire$nbikes)
# This is just a pallet of colors with a domain nbikes: 
#The number of bikes currently parked there

#Notice we can use the pipes to add layers
leaflet(data = cycle_hire) %>% 
  addProviderTiles(providers$CartoDB.Positron) %>% # Add a tile layer from a leaflet map provider
  addCircles(col = ~pal(nbikes), opacity = 0.9) %>% #Circles with pallete according to nbikes
  addLegend(pal = pal, values = ~nbikes) %>%  # Top right legend using the pallet
  setView(lng = -0.1, 51.5, zoom = 12) %>%  # Initial view
  addMiniMap() # Adds the minimap in the bottom right corner (default)

Other mapping packages

ggplot

Since version 2.3.0, the tidyverse plotting package ggplot2 has supported sf objects with geom_sf(). The syntax is similar to that used by tmap: an initial ggplot() call is followed by one or more layers, that are added with + geom_*(), where * represents a layer type such as geom_sf() (for sf objects) or geom_points() (for points).

ggplot2 plots graticules by default. The default settings for the graticules can be overridden using scale_x_continuous(), scale_y_continuous() or coord_sf(datum = NA).

Other notable features include the use of unquoted variable names encapsulated in aes() to indicate which aesthetics vary and switching data sources using the data argument, as demonstrated in the code chunk below:

library(ggplot2)
g1 = ggplot() + geom_sf(data = nz, aes(fill = Median_income)) +
  scale_x_continuous(breaks = c(170, 175))
g1

An advantage of ggplot2 is that it has a strong user-community and many add-on packages. Good additional resources can be found in the open source ggplot2 book (Wickham 2016) and in the descriptions of the multitude of ‘ggpackages’ such as ggrepel and tidygraph.

Another benefit of maps based on ggplot2 is that they can easily be given a level of interactivity when printed using the function ggplotly() from the plotly package.

plotly::ggplotly(g1)

At the same time, ggplot2 has a few drawbacks. The geom_sf() function is not always able to create a desired legend to use from the spatial data. Raster objects are also not natively supported in ggplot2 and need to be converted into a data frame before plotting.

Mexico’s geodatabase from INEGI

# State level map
mapa_mex_edos <- st_read("mg_sep2019_integrado/conjunto_de_datos/00ent.shp", 
                         options = "ENCODING=WINDOWS-1252")
## options:        ENCODING=WINDOWS-1252 
## Reading layer `00ent' from data source `/Users/Soporte/Desktop/Code/R Pro/Maps/mg_sep2019_integrado/conjunto_de_datos/00ent.shp' using driver `ESRI Shapefile'
## Simple feature collection with 32 features and 3 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 911292 ymin: 319149.1 xmax: 4082997 ymax: 2349615
## projected CRS:  MEXICO_ITRF_2008_LCC
# Municipal level map
mapa_mex_mun <- st_read("mg_sep2019_integrado/conjunto_de_datos/00mun.shp", 
                        options = "ENCODING=WINDOWS-1252")
## options:        ENCODING=WINDOWS-1252 
## Reading layer `00mun' from data source `/Users/Soporte/Desktop/Code/R Pro/Maps/mg_sep2019_integrado/conjunto_de_datos/00mun.shp' using driver `ESRI Shapefile'
## Simple feature collection with 2465 features and 4 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 911292 ymin: 319149.1 xmax: 4082997 ymax: 2349615
## projected CRS:  MEXICO_ITRF_2008_LCC
ENIGH <- read_csv(
  "https://raw.githubusercontent.com/diego-eco/diego-eco.github.io/master/downloads/ENIGH_concentradohogar_2018.csv")
# We select only three variables and create an additional one for the identifier by State "CVE_ENT"
ENIGH1 <- ENIGH %>% 
  dplyr::select(ubica_geo, factor, tot_integ, ing_cor) %>%
  mutate(CVE_ENT = 0)

# We extract the identifiers of each state from the locate_geo column and save them in CVE_ENT. 
# Note that we add a 0 when the State identifier does not start at 0

for (j in 1:nrow(ENIGH1)) {
  if( nchar(ENIGH1$ubica_geo[j] ) < 5){
  ENIGH1$CVE_ENT[j] <- paste0("0", substr(ENIGH1$ubica_geo[j], start = 1, stop = 1)) 
  } else { 
  ENIGH1$CVE_ENT[j] <- substr(ENIGH1$ubica_geo[j], start = 1, stop = 2)
  }
}
# We convert the CVE_ENT column into a factor to match that of the base mapa_mex_edos
ENIGH1$CVE_ENT <- as.factor(ENIGH1$CVE_ENT)
# We calculate the State weighted average income and store it on a new dataframe
ENIGH_edos <- ENIGH1 %>% group_by(CVE_ENT) %>%
  summarize(Ing_prom = weighted.mean(ing_cor, factor), 
            Poblacion = sum(factor*tot_integ))
# Replicate the original base to be able to use it later
mapeo_edos <- mapa_mex_edos

# Paste the data from "ENIGH_edos" to the dataframe "mapping_edos" with the inner_join function
mapeo_edos <- inner_join(mapa_mex_edos, ENIGH_edos, 
                   by = "CVE_ENT")
# Backup the spatial dataset 
mapeo_mun <- mapa_mex_mun

# Create a database with ENIGH data at the municipal level
ENIGH_muni <- ENIGH1 %>% group_by(ubica_geo) %>%
  summarize(Ing_prom = weighted.mean(ing_cor, factor), 
            Poblacion = sum(factor*tot_integ)) %>% 
  mutate(CVEGEO = 0) 

# We extract the identifiers of each state from the ubica_geo column and store them in the CVEGEO column
for (j in 1:nrow(ENIGH_muni)) {
  if( nchar(ENIGH_muni$ubica_geo[j] ) < 5){
  ENIGH_muni$CVEGEO[j] <- paste0("0", substr(ENIGH_muni$ubica_geo[j], start = 1, stop = 5)) 
  } else { 
  ENIGH_muni$CVEGEO[j] <- substr(ENIGH_muni$ubica_geo[j], start = 1, stop = 5)
  }
}

# We convert the CVE_ENT column into a factor to match that of the base mapa_mex_mun
ENIGH_muni$CVEGEO <- as.factor(ENIGH_muni$CVEGEO)
# There are only observations for 996 municipalities, 
# so we paste the ENIGH_muni data to the mapping_mun dataframe with the left_join function
mapeo_mun <- left_join(mapa_mex_mun, ENIGH_muni, 
                   by = "CVEGEO")

Mapa CDMX ingreso promedio por delegación

mapa_cdmx <- mapeo_mun %>% filter(CVE_ENT == "09")
library(mapview)
mapa_cdmx %>% mapview(color = "white", lwd = 3, layer.name = "CDMX") +
  mapview(mapa_cdmx, zcol = "Ing_prom") 

Mapa CDMX habitantes por delegación

mapa_cdmx %>% mapview(color = "white", lwd = 3, layer.name = "CDMX") +
  mapview(mapa_cdmx, zcol = "Poblacion")

Mapa CDMX ingreso promedio y habitantes por delegación

mapa_cdmx %>% mapview(color = "white", lwd = 3, layer.name = "CDMX") +
  mapview(mapa_cdmx, zcol = "Poblacion") + mapview(mapa_cdmx, zcol = "Ing_prom")

  1. El Colegio de México, ↩︎